FaCT: A Flexible, Constant-Time Programming Language

 February 27, 2023 at 11:25 pm

Facts

Compiler optimization passes can introduce new vulnerabilities into correctly written code. For example, an optimization pass may introduce branching instructions in originally branchless C code [46].

Idea

Writing constant-time (CT) code in C (or a C-like language) is not easy:

  • The compiler may optimize the CT instructions away and reintroduce branches
  • CT code is not easy to write in C (and is hard to read)
  • The generated CT code is not very efficient (e.g., it does not use CT-friendly instructions like cmov or adc)

FaCT also attaches an IFC (information-flow control) label to data to determine its security level.
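The fragility the paper targets shows up already in the classic branchless-select idiom. A minimal Python sketch of the masking trick (my own illustration, not code from the paper; a compiler pass on the equivalent C could legally rewrite this back into a branch, which is exactly the problem):

```python
M64 = 0xFFFFFFFFFFFFFFFF  # simulate 64-bit words

def ct_select(cond_bit: int, a: int, b: int) -> int:
    """Branchlessly return a if cond_bit == 1 else b.

    mask is all-ones when cond_bit == 1 and all-zeros otherwise, so the
    result is computed without any secret-dependent branch.
    """
    mask = -cond_bit & M64          # 0xFF..FF if 1, 0x00..00 if 0
    return (a & mask) | (b & ~mask & M64)

assert ct_select(1, 7, 9) == 7
assert ct_select(0, 7, 9) == 9
```

In C, this is the pattern a compiler may "helpfully" turn into `cond_bit ? a : b` with a branch; FaCT's point is to make the CT intent explicit to the compiler instead.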

Constant-Time Foundations for the New Spectre Era

 February 27, 2023 at 5:48 pm

Abstract

The constant-time discipline is a software-based countermeasure used for protecting high assurance cryptographic implementations against timing side-channel attacks. Constant-time is effective (it protects against many known attacks), rigorous (it can be formalized using program semantics), and amenable to automated verification. Yet, the advent of micro-architectural attacks makes constant-time as it exists today far less useful.

This paper lays foundations for constant-time programming in the presence of speculative and out-of-order execution. We present an operational semantics and a formal definition of constant-time programs in this extended setting. Our semantics eschews formalization of microarchitectural features (that are instead assumed under adversary control), and yields a notion of constant-time that retains the elegance and tractability of the usual notion. We demonstrate the relevance of our semantics in two ways: First, by contrasting existing Spectre-like attacks with our definition of constant-time. Second, by implementing a static analysis tool, Pitchfork, which detects violations of our extended constant-time property in real world cryptographic libraries.

Methodology

  • A model to incorporate speculative execution.
  • Abstract machine: fetch, execute, retire.
  • Reorder buffer => out-of-order and speculative execution
  • Directives are issued to model the detailed schedule of the reordered micro-ops
  • Observations are also modeled in this process to mimic the leakage visible to the attacker
  • Modeled instructions: op, fence, load, store, br, call, ret
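The directive/observation machinery above can be sketched very roughly as follows (a toy model of my own, far simpler than the paper's semantics: there is no retirement or rollback, and only load addresses leak):

```python
# Toy reorder-buffer machine: the attacker's "directives" pick which
# buffered micro-op executes next; "observations" record the leakage
# (here, the addresses of loads) visible to the attacker.
def run(program, directives):
    rob = list(program)          # micro-ops fetched in program order
    observations = []
    for idx in directives:       # attacker-chosen execution schedule
        op, arg = rob[idx]
        if op == "load":
            observations.append(("addr", arg))   # the address leaks
        elif op == "fence":
            pass                 # fence: no leakage; would order retirement
    return observations

prog = [("load", 0x1000), ("load", 0x2000), ("fence", None)]
# An out-of-order schedule: execute the second load first.
assert run(prog, [1, 0]) == [("addr", 0x2000), ("addr", 0x1000)]
```

The paper's definition of (speculative) constant-time then asks that the observation sequence be independent of secrets for every attacker-chosen directive sequence.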

Artifact

  • Symbolic execution based on angr
  • Schedule the worst case reorder buffer => maximize potential leakage
  • It can indeed find threats

Comments

  • Are the threats confirmed by the developers?
  • Is the model formalized in a proof checker?
  • Is the execution time of instructions not modeled? (Or does it not need to be?)

PrivGuard: Privacy Regulation Compliance Made Easier

 March 1, 2023 at 4:58 pm

Paper

Abstract

Continuous compliance with privacy regulations, such as GDPR and CCPA, has become a costly burden for companies from small-sized start-ups to business giants. The culprit is the heavy reliance on human auditing in today's compliance process, which is expensive, slow, and error-prone. To address the issue, we propose PrivGuard, a novel system design that reduces human participation required and improves the productivity of the compliance process. PrivGuard is mainly comprised of two components: (1) PrivAnalyzer, a static analyzer based on abstract interpretation for partly enforcing privacy regulations, and (2) a set of components providing strong security protection on the data throughout its life cycle. To validate the effectiveness of this approach, we prototype PrivGuard and integrate it into an industrial-level data governance platform. Our case studies and evaluation show that PrivGuard can correctly enforce the encoded privacy policies on real-world programs with reasonable performance overhead.

Methodology

Users can prescribe their privacy policies, and the analyst can then leverage user data for data analysis tasks. The different privacy policies are automatically enforced and satisfied by PrivGuard, which is executed inside a TEE.

The policy is prescribed in a formal language, and the data analysis program is statically analyzed by PrivAnalyzer to check privacy policy compliance. PrivAnalyzer uses a Python interpreter as an abstract interpreter to check whether the privacy policies might be broken by the program. Since a Python program may use many third-party libraries, the authors propose function summaries for these library functions and over-approximate their results.
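The idea of over-approximating function summaries can be illustrated with a toy interval domain (my own sketch; the `lib.*` names and the interval domain itself are illustrative, not PrivGuard's actual abstract domain):

```python
# Toy interval abstract domain with over-approximating summaries for
# third-party functions, in the spirit of PrivAnalyzer's approach.
TOP = (float("-inf"), float("inf"))  # "could be anything"

def add(a, b):
    # precise transfer function for interval addition
    return (a[0] + b[0], a[1] + b[1])

SUMMARIES = {
    # unknown library call: soundly over-approximate its result to TOP
    "lib.mystery": lambda *args: TOP,
    # known library call: a hand-written sound summary
    "lib.clip01": lambda a: (max(a[0], 0.0), min(a[1], 1.0)),
}

x = (2.0, 5.0)
assert add(x, (1.0, 1.0)) == (3.0, 6.0)
assert SUMMARIES["lib.mystery"](x) == TOP
assert SUMMARIES["lib.clip01"]((-1.0, 0.5)) == (0.0, 0.5)
```

Falling back to TOP for unmodeled library calls keeps the analysis sound at the cost of precision, which matches the trade-off described above.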

Weakness

  • Intentional information leakage (analyst is assumed trusted)
  • Language level bypasses

Sidenote

Some papers mentioned in this work are also interesting, especially those related to dealing with loops and branches in static analysis.

This paper may need to be checked again!

Privado: Practical and Secure DNN Inference with Enclaves

 March 1, 2023 at 4:57 pm

Abstract

Cloud providers are extending support for trusted hardware primitives such as Intel SGX. Simultaneously, the field of deep learning is seeing enormous innovation as well as an increase in adoption. In this paper, we ask a timely question: "Can third-party cloud services use Intel SGX enclaves to provide practical, yet secure DNN Inference-as-a-service?" We first demonstrate that DNN models executing inside enclaves are vulnerable to access pattern based attacks. We show that by simply observing access patterns, an attacker can classify encrypted inputs with 97% and 71% attack accuracy for MNIST and CIFAR10 datasets on models trained to achieve 99% and 79% original accuracy respectively. This motivates the need for PRIVADO, a system we have designed for secure, easy-to-use, and performance efficient inference-as-a-service. PRIVADO is input-oblivious: it transforms any deep learning framework that is written in C/C++ to be free of input-dependent access patterns thus eliminating the leakage. PRIVADO is fully-automated and has a low TCB: with zero developer effort, given an ONNX description of a model, it generates compact and enclave-compatible code which can be deployed on an SGX cloud platform. PRIVADO incurs low performance overhead: we use PRIVADO with Torch framework and show its overhead to be 17.18% on average on 11 different contemporary neural networks.

Model

  • Service: in-enclave ML model; the model is not public
  • Users: data providers; input & output are secret

Attack

Infer the output label from the memory access trace collected while the user's input is being processed.

  • DNNs contain data-dependent branches
  • An ML model (linear regression) is built from memory access traces and the output labels
  • It achieves high accuracy at inferring the output label from the memory trace
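The core of the attack can be caricatured in a few lines: if each input class exercises a different branch pattern, the trace alone identifies the label. This toy nearest-neighbor classifier stands in for the paper's linear-regression model, with made-up thresholds:

```python
# Toy access-pattern attack: each input class triggers a distinct branch
# pattern in the "model" (which ReLU-like units fire), so matching the
# observed trace against known profiles recovers the label.
def trace(x):
    # record which "neurons" take the x > threshold branch
    return tuple(int(x > t) for t in (0.2, 0.5, 0.8))

# attacker profiles one representative input per label offline
profiles = {0: trace(0.1), 1: trace(0.6), 2: trace(0.9)}

def infer_label(t):
    # nearest profile by Hamming distance
    return min(profiles, key=lambda lbl: sum(a != b for a, b in zip(profiles[lbl], t)))

assert infer_label(trace(0.15)) == 0
assert infer_label(trace(0.65)) == 1
assert infer_label(trace(0.95)) == 2
```

The encrypted input never leaves the enclave, yet the attacker classifies it anyway, which is the vulnerability the paper demonstrates on MNIST and CIFAR10.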

Defense

  • Data dependency usually occurs in activation functions (e.g., ReLU), max pooling, etc.; other layers contain hardly any data-dependent memory accesses.
  • Eliminate the input/secret-dependency in the Torch library
  • End-to-end model compilation
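The shape of the rewrite can be sketched on ReLU: replace the secret-dependent branch with arithmetic. This is a hand-written illustration only; Python gives no real constant-time guarantees, and PRIVADO's actual transformation works on the C/C++ framework code:

```python
def relu_branchy(x):
    # data-dependent branch: control flow (and thus access pattern) leaks sign(x)
    if x > 0:
        return x
    return 0.0

def relu_oblivious(x):
    # same result computed arithmetically: no secret-dependent branch
    # (in real CT code the comparison result would be a bitmask, not a bool)
    sign = (x > 0)
    return x * sign

assert relu_oblivious(3.5) == relu_branchy(3.5) == 3.5
assert relu_oblivious(-2.0) == relu_branchy(-2.0) == 0.0
```

Applying this kind of transformation uniformly across the library is what removes the input-dependent access patterns the attack exploited.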

Proof Complexity vs. Code Complexity

 December 26, 2022 at 9:52 pm

Potential Threats of Memory Integrity on SEV(SNP), (Scalable) SGX2, and TDX

 December 6, 2022 at 12:09 am

SGX2 Memory Integrity

Potential Attacks

  • Inside-in Aliasing
  • Outside-in Aliasing

Possible sources of aliasing

The server's RAS (Reliability, Availability, Serviceability) features

  • Memory Address Range Mirroring
  • Memory Predictive Failure Analysis (PFA)

PFA: if a physical memory page is believed to be affected by an underlying hardware fault (e.g., a weak cell or faulty row in a memory chip or DRAM), the affected page can be retired by relocating its content to another physical page, and placing the retired page on a list of physical pages that should not be subsequently allocated by the virtual memory system.

Possible attack from OS?

Via Memory Components (System Software)

  • Program critical system hardware devices, e.g., the memory controller or DMA engines (doc1, p113); note that DMA is controlled by the CPU on x86-64 systems
  • Program page tables/EPT => inside-in alias (doc1, p113)

Other possible attack

  • Firmware <= defended by secure boot/PFR/Intel Hardware Shield (doc. 2)

Documents

  1. https://www.intel.com/content/dam/develop/external/us/en/documents/332680-001-720907.pdf
  2. Intel PFR Github

Related Papers

TDX Problems

References

  1. Intel TDX

Sealed-Glass Proofs: Using Transparent Enclaves to Prove and Sell Knowledge

 November 17, 2022 at 2:30 pm

Abstract

Trusted hardware systems, such as Intel's new SGX instruction set architecture extension, aim to provide strong confidentiality and integrity assurances for applications. Recent work, however, raises serious concerns about the vulnerability of such systems to side-channel attacks. We propose, formalize, and explore a cryptographic primitive called a Sealed-Glass Proof (SGP) that models computation possible in an isolated execution environment with unbounded leakage, and thus in the face of arbitrary side-channels. A SGP specifically models the capabilities of trusted hardware that can attest to correct execution of a piece of code, but whose execution is transparent, meaning that an application's secrets and state are visible to other processes on the same host. Despite this strong threat model, we show that SGPs enable a range of practical applications. Our key observation is that SGPs permit safe verifiable computing in zero-knowledge, as data leakage results only in the prover learning her own secrets. Among other applications, we describe the implementation of an end-to-end bug bounty (or zero-day solicitation) platform that couples a SGX-based SGP with a smart contract. Our platform enables a marketplace that achieves fair exchange, protects against unfair bounty withdrawals, and resists denial-of-service attacks by dishonest sellers. We also consider a slight relaxation of the SGP model that permits black-box modules instantiating minimal, side-channel resistant primitives, yielding a still broader range of applications. Our work shows how trusted hardware systems such as SGX can support trustworthy applications even in the presence of side channels.

Overview

  • Transparent enclave execution (protects integrity)
  • Sealed-Glass Proofs -> verifiable computing, commitment schemes, ZK proofs
  • SGP -> smart contract -> knowledge marketplace -> bug bounty

Important Gadgets

  • The secret (of the prover) is processed on the prover's own host, so the system is resistant to side-channel attacks
  • Multiple provers are also permitted by an extended design
  • A bug bounty mechanism needs careful design to protect both the buyer and the sellers

A Systematic Look at Ciphertext Side Channels on AMD SEV-SNP

 November 11, 2022 at 10:29 am

Abstract

Hardware-assisted memory encryption offers strong confidentiality guarantees for trusted execution environments like Intel SGX and AMD SEV. However, a recent study by Li et al. presented at USENIX Security 2021 has demonstrated the CipherLeaks attack, which monitors ciphertext changes in the special VMSA page. By leaking register values saved by the VM during context switches, they broke state-of-the-art constant-time cryptographic implementations, including RSA and ECDSA in the OpenSSL. In this paper, we perform a comprehensive study on the ciphertext side channels. Our work suggests that while the CipherLeaks attack targets only the VMSA page, a generic ciphertext side-channel attack may exploit the ciphertext leakage from any memory pages, including those for kernel data structures, stacks and heaps. As such, AMD’s existing countermeasures to the CipherLeaks attack, a firmware patch that introduces randomness into the ciphertext of the VMSA page, is clearly insufficient. The root cause of the leakage in AMD SEV’s memory encryption—the use of a stateless yet unauthenticated encryption mode and the unrestricted read accesses to the ciphertext of the encrypted memory—remains unfixed. Given the challenges faced by AMD to eradicate the vulnerability from the hardware design, we propose a set of software countermeasures to the ciphertext side channels, including patches to the OS kernel and cryptographic libraries. We are working closely with AMD to merge these changes into affected open-source projects.

Paper

Background

  • CipherLeaks: ciphertext can be read by the hypervisor
  • In SEV, the XEX encryption mode is applied => for a fixed address, the same plaintext yields the same ciphertext
  • Controlled side channel: clearing the present bit in the NPT (nested page table) => PF
  • Before SEV-ES, registers were saved without encryption
  • SNP: the hypervisor cannot modify or remap guest VM pages (integrity protection)

Attack

  • Nginx SSL key generation -> 384-bit ECDSA key recovery

The attack exploits the constant-time swap algorithm: the ciphertext pattern of the decision (swap-mask) bit is observed, so the nonce can be derived by observing the mask across the 384 iterations.
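The root cause, stateless address-tweaked (XEX-style) encryption whose ciphertext the hypervisor can read, can be simulated in a few lines. SHA-256 stands in for the real AES tweak construction, and the address and values are made up:

```python
import hashlib

def encrypt(addr: int, plaintext: bytes) -> bytes:
    # stand-in for SEV's stateless encryption: deterministic in (addr, plaintext),
    # so the same plaintext at the same address always gives the same ciphertext
    return hashlib.sha256(addr.to_bytes(8, "little") + plaintext).digest()

# A "constant-time" swap writes the mask-dependent value back to a fixed
# address every iteration; the hypervisor observes only the ciphertext.
MASK_ADDR = 0x5000
c_zero = encrypt(MASK_ADDR, b"\x00" * 8)   # iteration where mask = 0
c_ones = encrypt(MASK_ADDR, b"\xff" * 8)   # iteration where mask = all-ones

# The ciphertext changing (or not) reveals the secret decision bit,
# even though the guest code itself is branch-free.
assert c_zero != c_ones
assert encrypt(MASK_ADDR, b"\x00" * 8) == c_zero   # determinism
```

This is why randomizing only the VMSA page is insufficient: any memory page whose ciphertext the hypervisor can snapshot leaks the same way.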

Leaky DNN: Stealing Deep-learning Model Secret with GPU Context-switching Side-channel

 November 8, 2022 at 9:58 pm

Abstract

Machine learning has been attracting strong interests in recent years. Numerous companies have invested great efforts and resources to develop customized deep-learning models, which are their key intellectual properties. In this work, we investigate to what extent the secret of deep-learning models can be inferred by attackers. In particular, we focus on the scenario that a model developer and an adversary share the same GPU when training a Deep Neural Network (DNN) model. We exploit the GPU side-channel based on context-switching penalties. This side-channel allows us to extract the fine-grained structural secret of a DNN model, including its layer composition and hyper-parameters. Leveraging this side-channel, we developed an attack prototype named MosConS, which applies LSTM-based inference models to identify the structural secret. Our evaluation of MosConS shows the structural information can be accurately recovered. Therefore, we believe new defense mechanisms should be developed to protect training against the GPU side-channel.

Method

  • Shared resource: CUPTI, NVIDIA's CUDA Profiling Tools Interface (performance counters)
  • CUDA kernels and blocks => construct concurrency & GPU contention => a spy
  • LSTM model training => 5 models: splitting iterations; recognizing long ops; recognizing other ops; inferring hyper-parameters; model correction

Questions

  • How is the GPU shared between VMs? How are different drivers prevented from interfering with each other (especially when both are root)?
  • Will these features (CUPTI) be available on GPUs in data centers?

SGXFUZZ: Efficiently Synthesizing Nested Structures for SGX Enclave Fuzzing

 October 27, 2022 at 5:01 pm

Paper

Abstract

Intel's Software Guard Extensions (SGX) provide a nonintrospectable trusted execution environment (TEE) to protect security-critical code from a potentially malicious OS. This protection can only be effective if the individual enclaves are secure, which is already challenging in regular software, and this becomes even more difficult for enclaves as the entire environment is potentially malicious. As such, many enclaves expose common vulnerabilities, e.g., memory corruption and SGX-specific vulnerabilities like null-pointer dereferences. While fuzzing is a popular technique to assess the security of software, dynamically analyzing enclaves is challenging as enclaves are meant to be non-introspectable. Further, they expect an allocated multi-pointer structure as input instead of a plain buffer.

In this paper, we present SGXFUZZ, a coverage-guided fuzzer that introduces a novel binary input structure synthesis method to expose enclave vulnerabilities even without source-code access. To obtain code coverage feedback from enclaves, we show how to extract enclave code from distribution formats. We also present an enclave runner that allows execution of the extracted enclave code as a user-space application at native speed, while emulating all relevant environment interactions of the enclave. We use this setup to fuzz enclaves using a state-of-the-art snapshot fuzzing engine that deploys our novel structure synthesis stage. This stage synthesizes multi-layer pointer structures and size fields incrementally on-the-fly based on fault signals. Furthermore, it matches the expected input format of the enclave without any prior knowledge. We evaluate our approach on 30 open- and closed-source enclaves and found a total of 79 new bugs and vulnerabilities.

Contributions

  • Enclave Dumping: Using the SGX driver, we completely extract executable enclave code on Windows and Linux in an automated way.
  • Enclave Runner: We show how to execute enclave code, including SGX-specific instructions, natively outside SGX to retrieve code coverage feedback.
  • Structure Synthesis: We present a method to synthesize multi-layer structures of pointers, arrays, and size fields using fuzzing.
  • Memory Location Havoc: To analyze whether enclaves are vulnerable to trust-boundary attacks, we develop a fuzzing stage that tries different memory locations for each of the pointers in the synthesized structure.
  • Fuzz Analysis: Utilizing the methods developed in this work, we analyze the security of 30 enclaves and found 79 new bugs and vulnerabilities. So far, a total of three CVEs were assigned and $13k in bug bounty were paid for the vulnerabilities we identified.
  • Limitations of Symbolic Execution: We analyze previous work on vulnerability detection using symbolic execution, discuss limitations, and extensively compare the results with SGXFUZZ.

I believe the most novel contribution in this paper is Structure Synthesis, which may also be related to another work.
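The fault-driven synthesis idea can be sketched with a toy "enclave" that faults whenever the input lacks a pointer it expects; each fault tells the fuzzer where to grow the structure. All names and offsets here are made up for illustration:

```python
# Toy structure synthesis: the enclave "faults" when it dereferences an
# offset the fuzzer has not yet backed with a pointer; the fault signal
# tells the synthesizer exactly where to add one, layer by layer.
def enclave(buf: dict) -> str:
    # expects buf[0] -> inner struct, inner[8] -> data
    inner = buf.get(0)
    if inner is None:
        return "FAULT@0"
    if inner.get(8) is None:
        return "FAULT@0->8"
    return "OK"

structure = {}
for _ in range(3):
    result = enclave(structure)
    if result == "FAULT@0":
        structure[0] = {}            # synthesize a pointer at offset 0
    elif result == "FAULT@0->8":
        structure[0][8] = b"data"    # synthesize a nested field at offset 8
    else:
        break

assert enclave(structure) == "OK"
```

SGXFUZZ does this against real fault signals from the extracted enclave binary, incrementally matching the expected multi-layer input format without source code.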

Design

Towards Formal Verification of State Continuity for Enclave Programs

 October 11, 2022 at 10:54 pm

PRIDWEN: Universally Hardening SGX Programs via Load-Time Synthesis

 October 11, 2022 at 9:53 pm

PDF

The Wasm binary is instrumented according to the hardware configuration detected by the Prober and a configuration file. The instrumented binary is then validated and can be attested by the remote user.

  • The validator acts as a final binary verifier, checking that the instrumentation exists as expected.
  • Attestation: software attestation in PRIDWEN

Contributions

  • The first platform-aware load-time synthesis framework for SGX programs
  • Attestable in-enclave Wasm instrumentation and compilation toolchain.
  • Extensive evaluation.

Safe, Untrusted Agents Using Proof-Carrying Code

 October 9, 2022 at 10:46 pm

Safe, Untrusted Agents Using Proof-Carrying Code

PCC Characteristics

  1. General
  2. Low-risk & automatic
  3. Efficient
  4. No need to trust the code producer
  5. Flexible

Components

The code and annotations are sent to a parser, which generates IL to be executed by a symbolic evaluator. The symbolic evaluator derives a predicate and asks an untrusted prover to prove it. The proof is encoded in the Edinburgh Logical Framework; type-checking the proof amounts to checking that it proves the right predicate.
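The pipeline above can be caricatured as a tiny verification-condition generator plus checker (my own sketch; the real system uses LF proof terms, not brute-force evaluation, and the safety policy is richer than bounds checks):

```python
# Toy VC generation in the PCC spirit: walk the IL, emit one safety
# predicate per memory access, then discharge the predicates. Here the
# "proof check" is direct evaluation; in PCC it is LF type-checking.
def vcgen(program, bounds):
    lo, hi = bounds
    vcs = []
    for op, addr in program:
        if op in ("read", "write"):
            vcs.append((lo, addr, hi))      # VC: lo <= addr < hi
    return vcs

def check(vcs):
    return all(lo <= addr < hi for lo, addr, hi in vcs)

safe   = [("read", 4), ("write", 7)]
unsafe = [("read", 4), ("write", 99)]
assert check(vcgen(safe, (0, 8))) is True
assert check(vcgen(unsafe, (0, 8))) is False
```

The division of labor is the key PCC idea: deriving and proving the predicate is untrusted; only the small checker that validates the proof is in the TCB.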

Gadgets

  • Invariant annotation is the most important one
  • Their security policy: safe read/write; execution time bounded

Trusted Computing Base (TCB)

 September 8, 2022 at 9:48 pm

Blog posts

What’s a Trusted Compute Base?

Early Documents

SPECIFICATION OF A TRUSTED COMPUTING BASE (TCB)

  • G. H. Nibaldi, 30 November 1979 PDF

A Trusted Computing Base (TCB) is the totality of access control mechanisms for an operating system.

A TCB is a hardware and software access control mechanism that establishes a protection environment to control the sharing of information in computer systems. A TCB is an implementation of a reference monitor, as defined in [Anderson 72], that controls when and how data is accessed.

Proof that the TCB will indeed enforce the relevant protection policy can only be provided through a formal, methodological approach to TCB design and verification... Because the TCB consists of all the security-related mechanisms, proof of its validity implies the remainder of the system will perform correctly with respect to the policy.

Reference Monitor

a TCB is an implementation of a reference monitor.

  • complete mediation of access
  • self-protecting
  • verifiable

Minimizing the complexity of TCB software is a major factor in raising the confidence level that can be assigned to the protection mechanisms it provides.

...two general design goals to follow after identifying all security relevant operations for inclusion in the TCB are (a) to exclude from the TCB software any operations not strictly security-related so that one can focus attention on those that are, and (b) to make as full use as possible of protection features available in the hardware.

DEPARTMENT OF DEFENSE TRUSTED COMPUTER SYSTEM EVALUATION CRITERIA

  • DoD 5200.28-STD, 15 Aug 83 PDF

The heart of a trusted computer system is the Trusted Computing Base (TCB) which contains all of the elements of the system responsible for supporting the security policy and supporting the isolation of objects (code and data) on which the protection is based.

... In the interest of understandable and maintainable protection, a TCB should be as simple as possible consistent with the functions it has to perform. Thus, the TCB includes hardware, firmware, and software critical to protection and must be designed and implemented such that system elements excluded from it need not be trusted to maintain protection.

Trusted Computing Base (TCB) - The totality of protection mechanisms within a computer system – including hardware, firmware, and software – the combination of which is responsible for enforcing a security policy. A TCB consists of one or more components that together enforce a unified security policy over a product or system. The ability of a trusted computing base to correctly enforce a security policy depends solely on the mechanisms within the TCB and on the correct input by system administrative personnel of parameters (e.g., a user's clearance) related to the security policy.

Now the concept of a TCB is applicable not only to OSs but also to embedded systems, and it focuses on the security-critical portion of the system, including both hardware and software.

Some systems (Class A1) still require a formal design specification and verification of the TCB to ensure high degrees of assurance.

Authentication in Distributed Systems: Theory and Practice

Another important concept is the ‘trusted computing base’ or TCB [9], a small amount of software and hardware that security depends on and that we distinguish from a much larger amount that can misbehave without affecting security

Some weaknesses of the TCB model

S&P 1997 Paper

Authentication in Distributed Systems: Theory and Practice,

ACM Transactions on Computer Systems, 1992

It’s not quite true that components outside the TCB can fail without affecting security. Rather, the system should be ‘fail-secure’: if an untrusted component fails, the system may deny access it should have granted, but it won’t grant access it should have denied.

An Efficient TCB for a Generic Content Distribution System

2012 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery PDF

The trusted computing base (TCB) [1] for a system is a small amount of hardware and/or software that need to be trusted in order to realize the desired assurances. More specifically, the assurances are guaranteed even if all elements outside the TCB misbehave.

The lower the complexity of the elements in the TCB, the lower is the ability to hide malicious/accidental functionality in the TCB components. Consequently, in the design of any security solution it is necessary to lower the complexity of components in the TCB to the extent feasible.

More Recent Study

TCB Minimizing Model of Computation (TMMC)

Bushra, Naila.  Mississippi State University ProQuest Dissertations Publishing,  2019. 27664004. Paper

Reducing TCB Complexity for Security-Sensitive Applications: Three Case Studies

EuroSys, 2006 PDF

The security requirements fall into four main categories: confidentiality, integrity, recoverability, and availability. For clarity, we present the definition of these terms.

  • Confidentiality: Only authorized users (entities, principals, etc.) can access information (data, programs, etc.).
  • Integrity: Either information is current, correct, and complete, or it is possible to detect that these properties do not hold.
  • Recoverability: Information that has been damaged can be recovered eventually.
  • Availability: Data is available when and where an authorized user needs it.

Justifications of Reducing TCB

Relationships between selected software measures and latent bug-density: Guidelines for improving quality

Paper

It seems that nearly all code size/complexity measurements contribute to bug density, except the Method Hiding Factor and Polymorphism Factor.

This work just focuses on C++ programs. What about using a different language, e.g., Rust?

Choice of Language also Matters

Rust in the Android platform

Rust modernizes a range of other language aspects, which results in improved correctness of code:

Other Related Work

Œuf: Minimizing the Coq Extraction TCB

Reducing TCB complexity for security-sensitive applications: Three case studies